Notes (Week 10 Monday)
These are the notes I wrote during the lecture.
Deal with λ-calculus:
- we have *way* too many slides
- there's no way we'll get through them all
- therefore: whatever we don't do, I promise won't be on the exam :)
Once we have the λ-notation for function application,
we don't need to call functions by name anymore.
f(x) = x + 5
I want to write:
f(3)
In the "normal" notation I can't do this without separately
defining the function :(
With λ-notation I can just write:
f = λx. x + 5
I can still write
f(3)
Or (equivalently) I can write:
(λx. x + 5)(3)
^ I can *inline* functions: use them in expressions
without naming them
To evaluate this expression:
(λx. x + 5) 3
Take the body
x+5
and replace all occurrences of x by 3
"beta-reduces to" (more later)
(λx. x + 5) 3 ↦β (x+5)[x:=3] = 3 + 5
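The same inline-application idea works directly in Python, whose lambdas mirror λ-notation (a small sketch, not part of the lecture):

```python
# Named version: f(x) = x + 5
f = lambda x: x + 5
assert f(3) == 8

# Inline version, mirroring (λx. x + 5) 3 -- no separate definition needed
assert (lambda x: x + 5)(3) == 8
```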
- Syntax
The λ-calculus terms take three forms:
- variables (similar to atoms in propositional logic,
or to variables in predicate logic)
- function application
- function abstractions
(x,y,z range over variables)
(s,t,u range over λ-calculus terms)
t ::= x -- variable
| s t -- function application
| λx. t -- function abstraction
^ That's literally the whole syntax (in most presentations)
The slides also include constants as a base case
(will be needed for the HOL material, but we won't get that far)
Some syntactic conventions:
associativity:
function application associates *to the left*
f x y read this as (f x) y
We only have single-argument functions λx. t
But: we can encode multi-argument functions by nesting λs
(λx. λy. x + y) <-- a function that accepts two arguments
and adds them together
^ a function which, given an argument x,
returns *another function* which, given an argument y,
adds them together
Currying: the technique of encoding multi-argument functions
by nested function application
def: f(x,y) = x + y
app: f(3,5)
(λx. λy. x + y) 3 5
read as
((λx. λy. x + y) 3) 5
↦β
(λy. 3 + y) 5
↦β
3 + 5
That's why function application is left-associative.
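Currying translates to Python almost verbatim (a sketch; `add` is our own name):

```python
# λx. λy. x + y
add = lambda x: lambda y: x + y

# add(3) is itself a function, behaving like λy. 3 + y
add3 = add(3)
assert add3(5) == 8

# Left-associative application ((f 3) 5) is just chained calls:
assert add(3)(5) == 8
```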
More conventions:
- Nested lambdas are often grouped together
Instead of
λx. λy. x + y ∀x. ∀y. x < y
we often write
λx y. x + y ∀x y. x < y
- The scope of a λ extends as far to the right as possible
λx. f x y is the same thing as
(λx. (f x y))
which is different from
(λx. f x) y
- Substitution
((λx. x + 1) x)[x := 2]
^ bound ^ free
=
(λx. x + 1) 2
λx. is a *binder* for x
in the same way that ∃x or ∀x is
- Semantics (β-reduction)
The semantics is easy in the sense that it's tiny:
whenever you have a β-redex, you can reduce it.
A β-redex is an expression of the following form:
(λx. t) s
To reduce it, replace all free occurrences of x in t by s.
____________________ (β-rule)
(λx. t) s ↦β t[x:=s]
There are three more (structural) rules.
Morally, they say "you can β-reduce anywhere"
t ↦β t'
_________________________ (app1)
s t ↦β s t'
s ↦β s'
_________________________ (app2)
s t ↦β s' t
s ↦β s'
_________________________ (lam)
(λx. s) ↦β (λx. s')
_____________ (β-rule)
(λx. x) 3 ↦β 3
_____________________ (app1)
f ((λx. x) 3) ↦β f 3
____________________________ (app2)
(f ((λx. x) 3)) 4 ↦β (f 3) 4
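The β-rule plus the three structural rules can be sketched as a one-step reducer in Python, using tuples as terms. Caveat: the substitution below is naive (it does not rename bound variables), so it is only safe when no variable capture can occur; the tuple representation is our own.

```python
# Terms: ('var', x) | ('app', s, t) | ('lam', x, t)

def subst(t, x, s):
    """t[x := s], naive (no capture-avoiding renaming)."""
    if t[0] == 'var':
        return s if t[1] == x else t
    if t[0] == 'app':
        return ('app', subst(t[1], x, s), subst(t[2], x, s))
    if t[0] == 'lam':
        if t[1] == x:                 # x is shadowed: stop here
            return t
        return ('lam', t[1], subst(t[2], x, s))

def step(t):
    """One β-step anywhere in t, or None if t has no β-redex."""
    if t[0] == 'app':
        fun, arg = t[1], t[2]
        if fun[0] == 'lam':                        # (β-rule)
            return subst(fun[2], fun[1], arg)
        r = step(fun)                              # (app2)
        if r is not None:
            return ('app', r, arg)
        r = step(arg)                              # (app1)
        if r is not None:
            return ('app', fun, r)
    if t[0] == 'lam':                              # (lam)
        r = step(t[2])
        if r is not None:
            return ('lam', t[1], r)
    return None

# (λx. x) y ↦β y
assert step(('app', ('lam', 'x', ('var', 'x')), ('var', 'y'))) == ('var', 'y')
```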
Q: If I can β-reduce an expression to another, are they in some sense
equal?
A: We don't consider them *syntactically* equal,
but they would be semantically equal.
In λ-calculus speak, they are:
alpha-beta equivalent.
Two terms are
α-equivalent
t ≡α t'
if they differ only in the names chosen for bound variables
(λx. x) ≡α (λy. y)
β-equivalence:
the smallest equivalence relation that relates each β-redex to its reduct
Q: can you write a denotation of λ-terms? ⟦λx. x⟧ : ...
A: yes, but (read some Dana Scott if you really want to know)
^ *way* beyond the scope of the course
- Church encodings (aka how do you program in this stupid thing?)
I won't *prove* that we can do all computable functions in λ-calculus.
Instead I will give you an intuition for how you might,
by sketching how to encode some common programming constructs.
The standard Church encodings of
- booleans
- if-then-else
- natural numbers
- recursive functions
For the Church encodings
cheesy slogan: "A Church encoding is what it does"
we're going to encode a data type (bool, nat, ..)
as the thing we typically do to consume it.
Q: What do you do with a nat in a program?
A: Iterate over it (repeat something n times)
therefore: a Church number is a function
that repeats another function n times
Church Numbers Peano
λf x. x 0 Z
λf x. f x 1 S Z
λf x. f (f x) 2 S (S Z)
λf x. f^n x n S^n Z
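The table above translates directly into Python closures (a sketch; the `decode` helper, which reads a number back by applying the numeral to "+1" and 0, is our own addition):

```python
zero = lambda f: lambda x: x          # λf x. x
one  = lambda f: lambda x: f(x)       # λf x. f x
two  = lambda f: lambda x: f(f(x))    # λf x. f (f x)

# Decode: apply f = (+1) to x = 0, n times
decode = lambda n: n(lambda k: k + 1)(0)

assert decode(zero) == 0
assert decode(one) == 1
assert decode(two) == 2
```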
To illustrate how you can compute on Church numerals,
let's define
- successor function
S = λn. λf x. f(n f x)
(or equivalently)
λn. λf x. n f (f x)
n f x = apply f to x n times
f(n f x) = apply f to x n+1 times
S 0 ≡α
(λn. λf x. f(n f x)) (λg y. y)
^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^
↦β
λf x. f((λg y. y) f x)
^^^^^^^^^^^
↦β
λf x. f((λy. y) x)
^^^^^^^^^
↦β
λf x. f x
=
1
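The successor function S can be checked the same way in Python (again using our own `decode` helper to read numerals back):

```python
zero = lambda f: lambda x: x                      # λf x. x
succ = lambda n: lambda f: lambda x: f(n(f)(x))   # λn. λf x. f (n f x)

decode = lambda n: n(lambda k: k + 1)(0)

one = succ(zero)
assert decode(one) == 1
assert decode(succ(succ(one))) == 3
```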
- addition
λn m. λf x. n f (m f x)
Let's do 1+1
(+) 1 1
=
(λn m. λf x. n f (m f x)) 1 1
↦β
(λm. λf x. 1 f (m f x)) 1
↦β
(λf x. 1 f (1 f x))
=
(λf x. 1 f ((λf x. f x) f x))
↦β
(λf x. 1 f ((λx. f x) x))
↦β
(λf x. 1 f (f x))
=
(λf x. (λf x. f x) f (f x))
^^^^^^^^^^^^^
↦β
(λf x. (λx. f x) (f x))
^^^^^^^^^^^^^^^
↦β
(λf x. f (f x))
=
2
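Addition, too, runs as-is in Python (the names `plus`/`decode` are ours):

```python
one  = lambda f: lambda x: f(x)
# λn m. λf x. n f (m f x)
plus = lambda n: lambda m: lambda f: lambda x: n(f)(m(f)(x))

decode = lambda n: n(lambda k: k + 1)(0)

assert decode(plus(one)(one)) == 2   # 1 + 1 = 2
```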
Q: What do you do with a bool in a program?
A: do branching on it
therefore: a Church boolean is going to be
a function that chooses between two options
λx y. x true
λx y. y false
negation ¬b
λb. λx y. b y x
or equivalently:
λb. b false true
(¬) (true)
=
(λb. λx y. b y x) (λz w. z)
↦β
(λx y. (λz w. z) y x)
^^^^^^^^^^^
↦β
(λx y. (λw. y) x)
↦β
(λx y. y)
=
false
if-then-else
if b then c else d <-- your favourite language
b c d
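Booleans, negation, and if-then-else in Python form (a sketch; `ite` is our own name for the if-then-else wrapper):

```python
true  = lambda x: lambda y: x                 # λx y. x
false = lambda x: lambda y: y                 # λx y. y

neg = lambda b: lambda x: lambda y: b(y)(x)   # λb. λx y. b y x
ite = lambda b: lambda c: lambda d: b(c)(d)   # "if b then c else d" is just b c d

assert ite(true)('then')('else') == 'then'
assert ite(neg(true))('then')('else') == 'else'
```

One caveat: Python evaluates arguments eagerly, so `ite` computes both branches before choosing; in the pure λ-calculus the evaluation order is up to you.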
It's not obvious at first glance that you could write a λ-term
whose evaluation doesn't terminate.
But we do have self-application
(λx. x x) (λx. x x)
↦β
(x x)[x:=(λx. x x)]
=
(λx. x x) (λx. x x)
^ basically the λ equivalent of while(true) do skip;
or f(x) = return f(x)
^ this is an example of a λ-calculus term with *no normal form*
A *normal form* is a λ-term with no β-redexes
- Fix-point combinators
^ potentially diverging λ-terms that do something interesting
Y combinator is a
λ-term that satisfies the following equation:
Y f ≡αβη f(Y f)
^ this equation characterises iteration or recursion
while true do P ≡ P; while true do P
(we don't talk about η)
Y = (λf. (λx. f (x x)) (λx. f(x x)))
Y g
= (λf. (λx. f (x x)) (λx. f(x x))) g
↦β (λx. g (x x)) (λx. g(x x))
↦β g ((λx. g(x x)) (λx. g(x x)))
^ does this really give us recursion?
f(x) =
if x = 0 then
return 3
else
return f(x-1)
// x: the argument to f above
// f: a function that behaves as the f above
Y (λf x. (x = 0) 3 (f(x-1)))
≡β
(λx. (x = 0) 3 ((Y (λf x. (x = 0) 3 (f(x-1))))(x-1)))
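The recursive example can be run in Python, with one adaptation: Python is call-by-value, so the Y combinator as written loops forever. The η-expanded variant (often called the Z combinator), which delays the self-application, works instead; this substitution is ours, not from the notes:

```python
# Z = λf. (λx. f (λv. x x v)) (λx. f (λv. x x v))
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# f(x) = if x = 0 then 3 else f(x-1), tied with Z instead of Y
f = Z(lambda rec: lambda x: 3 if x == 0 else rec(x - 1))

assert f(0) == 3
assert f(10) == 3
```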
- Confluence
A property that all λ-calculus terms have:
intuitively: a term cannot have two different normal forms
(λx. x) ((λy. y) z) <--- has two β-redexes
(λx. x) ((λy. y) z) ↦β (λy. y) z ↦β z
(λx. x) ((λy. y) z) ↦β (λx. x) z ↦β z
Confluence (aka the Church-Rosser theorem) is the property that
the choice doesn't matter, in the sense that you may set off on
different paths initially, but you can always converge back to
the same term
s
/ \
β / \ β
/ \
t u
\ /
β*\ / β*
\ /
v
Formally:
If s ↦β t and s ↦β u,
then there exists v such that
t ↦β* v and u ↦β* v.
Intuitively: any choice of eval order yields the same final result.
You can usually encode loops as tail-recursive functions
while(x ≠ 0)
y := x * y;
x := x - 1;
f(x,y) =
if x = 0 then
return y
else
return f(x-1,x*y)
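The tail-recursive encoding above runs directly in Python; starting from y = 1 it computes the factorial of x:

```python
def f(x, y):
    # f(x, y) = if x = 0 then y else f(x-1, x*y)
    if x == 0:
        return y
    return f(x - 1, x * y)

assert f(3, 1) == 6     # 3!
assert f(5, 1) == 120   # 5!
```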
Final point: why λ-calculus (out of the box)
fails as a foundation of mathematics.
Y (¬)
≡
(¬) (Y (¬))
^ in words:
there exists a term that's equal to its own negation
or:  λa b. a a b
and: λa b. a b a